Unsupervised speaker segmentation with residual phase and MFCC features

نویسندگان

S. Jothilakshmi

Vennila Ramalingam

S. Palanivel

چکیده

This paper proposes an unsupervised method for improving the automatic speaker segmentation performance by combining the evidence from residual phase (RP) and mel frequency cepstral coefficients (MFCC). This method demonstrates the complementary nature of speaker specific information present in the residual phase in comparison with the information present in the conventional MFCC. Moreover this method presents an unsupervised speaker segmentation algorithm based on support vector machine (SVM). The experiments show that the combination of residual phase and MFCC helps to identify more robustly the transitions among speakers. 2009 Elsevier Ltd. All rights reserved.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Unsupervised Speaker Segmentation using Autoassociative Neural Network

In this paper we propose an unsupervised approach to speaker segmentation using autoassociative neural network (AANN). Speaker segmentation aims at finding speaker change points in a speech signal which is an important preprocessing step to audio indexing, spoken document retrieval and multi speaker diarization. The method extracts the speaker specific information from the Mel frequency cepstra...

متن کامل

Speaker Identification by Combining Various Vocal Tract and Vocal Source Features

Previously, we proposed a speaker recognition system using a combination of MFCC-based vocal tract feature and phase information which includes rich vocal source information. In this paper, we investigate the efficiency of combination of various vocal tract features (MFCC and LPCC) and vocal source features (phase and LPC residual) for normal-duration and short-duration utterance. The Japanese ...

متن کامل

Combining amplitude and phase-based features for speaker verification with short duration utterances

Due to the increasing use of fusion in speaker recognition systems, one trend of current research activity focuses on new features that capture complementary information to the MFCC (Mel-frequency cepstral coefficients) for improving speaker recognition performance. The goal of this work is to combine (or fuse) amplitude and phase-based features to improve speaker verification performance. Base...

متن کامل

LP Residual Features for Robust, Privacy-Sensitive Speaker Diarization

We present a comprehensive study of linear prediction residual for speaker diarization on single and multiple distant microphone conditions in privacy-sensitive settings, a requirement to analyze a wide range of spontaneous conversations. Two representations of the residual are compared, namely real-cepstrum and MFCC, with the latter performing better. Experiments on RT06eval show that residual...

متن کامل

Speaker2Vec: Unsupervised Learning and Adaptation of a Speaker Manifold Using Deep Neural Networks with an Evaluation on Speaker Segmentation

This paper presents a novel approach, we term Speaker2Vec, to derive a speaker-characteristics manifold learned in an unsupervised manner. The proposed representation can be employed in different applications such as diarization, speaker identification or, as in our evaluation test case, speaker segmentation. Speaker2Vec exploits large amounts of unlabeled training data and the assumption of sh...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

Expert Syst. Appl.

دوره 36 شماره

صفحات -

تاریخ انتشار 2009

Unsupervised speaker segmentation with residual phase and MFCC features

نویسندگان

چکیده

منابع مشابه

Unsupervised Speaker Segmentation using Autoassociative Neural Network

Speaker Identification by Combining Various Vocal Tract and Vocal Source Features

Combining amplitude and phase-based features for speaker verification with short duration utterances

LP Residual Features for Robust, Privacy-Sensitive Speaker Diarization

Speaker2Vec: Unsupervised Learning and Adaptation of a Speaker Manifold Using Deep Neural Networks with an Evaluation on Speaker Segmentation

عنوان ژورنال:

اشتراک گذاری